{
 "metadata": {
  "name": "Limit_Theorems"
 },
 "nbformat": 3,
 "nbformat_minor": 0,
 "worksheets": [
  {
   "cells": [
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "<h1>Limit Theorems</h1>\n",
      "Consider a sequence $X_1, X_2, \\ldots \\hspace{1pt}$ of i.i.d. random variables with mean $\\mu\\hspace{1pt}$ and variance $\\sigma^2\\hspace{1pt}$. Let \n",
      "\n",
      "$$S_n = \\sum_{i=1}^n X_i$$\n",
      "\n",
      "be the partial sum of the first *n* random variables. By **independence** we have\n",
      "\n",
      "$$var\\left(S_n\\right) = \\sum_{i=1}^n var\\left(X_i\\right) = n \\sigma^2$$\n",
      "\n",
      "We now define a new random variable, called the **sample mean**, given by \n",
      "\n",
      "$$M_n = \\frac{1}{n}\\sum_{i=1}^n X_i = \\frac{S_n}{n}$$\n",
      "\n",
      "which has expected value and variance\n",
      "\n",
      "$$E\\left[M_n\\right] = \\mu \\quad var\\left(M_n\\right) = \\frac{\\sigma^2}{n}$$\n",
      "\n",
      "Notice that the variance of the sample mean decreases to zero as *n* increases, so for large *n* most of the probability mass of $M_n\\hspace{1pt}$ lies close to the mean $\\mu\\hspace{1pt}$. \n",
      "\n",
      "We also introduce the random variable \n",
      "\n",
      "$$Z_n = \\frac{S_n - n\\mu}{\\sigma \\sqrt{n}}$$\n",
      "\n",
      "for which \n",
      "\n",
      "$$E\\left[Z_n\\right] = 0 \\quad var\\left(Z_n\\right) = 1$$"
     ]
    },
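    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "As a quick check of the moments of $Z_n\\hspace{1pt}$ stated above, the following sketch uses only linearity of expectation, $E\\left[S_n\\right] = n\\mu\\hspace{1pt}$, and $var\\left(S_n\\right) = n\\sigma^2\\hspace{1pt}$. It assumes $0 \\lt \\sigma^2 \\lt \\infty\\hspace{1pt}$.\n",
      "\n",
      "$$E\\left[Z_n\\right] = \\frac{E\\left[S_n\\right] - n\\mu}{\\sigma\\sqrt{n}} = \\frac{n\\mu - n\\mu}{\\sigma\\sqrt{n}} = 0$$\n",
      "\n",
      "$$var\\left(Z_n\\right) = \\frac{var\\left(S_n\\right)}{\\sigma^2 n} = \\frac{n\\sigma^2}{n\\sigma^2} = 1$$"
     ]
    },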
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "<h2>Markov Inequality</h2>\n",
      "If a RV *X* can only take nonnegative values, then\n",
      "\n",
      "$$P\\left(X \\ge a \\right) \\le \\frac{E\\left[X\\right]}{a} \\quad \\forall a \\gt 0$$"
     ]
    },
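    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "For illustration, take *X* uniform on $[0, 4]\\hspace{1pt}$ (a distribution chosen here purely as an example), so that $E\\left[X\\right] = 2\\hspace{1pt}$. With $a = 3\\hspace{1pt}$ the Markov bound and the exact probability are\n",
      "\n",
      "$$P\\left(X \\ge 3\\right) \\le \\frac{2}{3} \\approx 0.67 \\qquad P\\left(X \\ge 3\\right) = \\frac{4 - 3}{4} = 0.25$$\n",
      "\n",
      "The bound holds but is far from tight in this case, which is to be expected since it uses only the mean of *X*."
     ]
    },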
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "<h2>Chebyshev Inequality</h2>\n",
      "If *X* is a RV with mean $\\mu \\hspace{1pt}$ and variance $\\sigma^2\\hspace{1pt}$, then\n",
      "\n",
      "$$P\\left(\\left| X - \\mu \\right| \\ge c \\right) \\le \\frac{\\sigma^2}{c^2} \\quad \\forall c \\gt 0$$\n",
      "\n",
      "An alternative form of the Chebyshev inequality is obtained by letting $c=k\\sigma\\hspace{1pt}$ where *k* is positive. This gives\n",
      "\n",
      "$$P\\left(\\left| X - \\mu \\right| \\ge k\\sigma \\right) \\le \\frac{1}{k^2} $$\n",
      "\n",
      "which indicates that the probability of an observation of the random variable *X* falling at least *k* standard deviations from the mean is at most $1/k^2\\hspace{1pt}$. "
     ]
    },
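    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "The Chebyshev inequality can be obtained from the Markov inequality. As a short sketch: the RV $\\left(X - \\mu\\right)^2\\hspace{1pt}$ is nonnegative, so applying the Markov inequality to it with $a = c^2\\hspace{1pt}$ gives\n",
      "\n",
      "$$P\\left(\\left| X - \\mu \\right| \\ge c \\right) = P\\left(\\left( X - \\mu \\right)^2 \\ge c^2 \\right) \\le \\frac{E\\left[\\left(X - \\mu\\right)^2\\right]}{c^2} = \\frac{\\sigma^2}{c^2}$$\n",
      "\n",
      "To get a rough sense of how conservative the bound can be, assume for illustration that *X* is normally distributed. For $k = 2\\hspace{1pt}$ the bound is $1/4\\hspace{1pt}$, while the exact value is $2\\left(1 - \\Phi\\left(2\\right)\\right) \\approx 0.046\\hspace{1pt}$, where $\\Phi\\hspace{1pt}$ denotes the standard normal CDF."
     ]
    },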
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "<h2>Weak Law of Large Numbers</h2>\n",
      "Let $X_1, X_2, \\ldots \\hspace{1pt}$ be i.i.d. RVs with mean $\\mu\\hspace{1pt}$. For **every** $\\epsilon > 0 \\hspace{1pt}$ \n",
      "\n",
      "$$\\lim_{n\\rightarrow \\infty} P\\left(\\left|M_n - \\mu \\right| \\ge \\epsilon \\right)= 0$$"
     ]
    },
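    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "A proof sketch under the additional assumption that the variance $\\sigma^2\\hspace{1pt}$ is finite (the statement above only requires a finite mean, so this covers a special case): applying the Chebyshev inequality to $M_n\\hspace{1pt}$, which has mean $\\mu\\hspace{1pt}$ and variance $\\sigma^2/n\\hspace{1pt}$, gives\n",
      "\n",
      "$$P\\left(\\left|M_n - \\mu \\right| \\ge \\epsilon \\right) \\le \\frac{\\sigma^2}{n \\epsilon^2} \\rightarrow 0 \\quad \\text{as } n \\rightarrow \\infty$$\n",
      "\n",
      "For example, for fair coin flips encoded as 0 and 1 (an illustrative choice, with $\\sigma^2 = 1/4\\hspace{1pt}$), taking $\\epsilon = 0.1\\hspace{1pt}$ and $n = 1000\\hspace{1pt}$ bounds the probability by $0.25/\\left(1000 \\times 0.01\\right) = 0.025\\hspace{1pt}$."
     ]
    },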
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "<h2>Convergence in Probability</h2>\n",
      "Let $Y_1, Y_2, \\ldots \\hspace{1pt}$ be a sequence of RVs, *not necessarily independent*, and let *a* be a real number. We say that the sequence $Y_n \\hspace{1pt}$ **converges to** *a* **in probability** if for every $\\epsilon \\gt 0 \\hspace{1pt}$ we have\n",
      "\n",
      "$$\\lim_{n\\rightarrow \\infty} P\\left( \\left| Y_n -a \\right|  \\gt \\epsilon \\right) = 0$$\n",
      "\n",
      "Intuitively, for large *n* almost all of the probability mass of $Y_n \\hspace{1pt}$ lies within the interval of width $2\\epsilon\\hspace{1pt}$ centered on the point *a*. However, this says nothing about the shape of the distribution within that interval. \n",
      "\n",
      "This can be rephrased in the following way: For every $\\epsilon \\gt 0 \\hspace{1pt}$ and for any $\\delta \\gt 0 \\hspace{1pt}$, there exists $n_0 \\hspace{1pt}$ such that\n",
      "\n",
      "$$ P\\left( \\left| Y_n -a \\right|  \\gt \\epsilon \\right) \\le \\delta \\quad \\forall n \\ge n_0$$\n",
      "\n",
      "where $\\epsilon \\hspace{1pt}$ is known as the **accuracy** and $\\delta \\hspace{1pt}$ is known as the **confidence**. "
     ]
    },
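    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "As an illustration (an example chosen here, not part of the definition), let $Y_n = \\min\\left(X_1, \\ldots, X_n\\right)\\hspace{1pt}$ where the $X_i\\hspace{1pt}$ are independent and uniform on $[0, 1]\\hspace{1pt}$. For $0 \\lt \\epsilon \\lt 1\\hspace{1pt}$,\n",
      "\n",
      "$$P\\left(\\left| Y_n - 0 \\right| \\gt \\epsilon \\right) = P\\left(X_1 \\gt \\epsilon, \\ldots, X_n \\gt \\epsilon \\right) = \\left(1 - \\epsilon\\right)^n \\rightarrow 0 \\quad \\text{as } n \\rightarrow \\infty$$\n",
      "\n",
      "so $Y_n\\hspace{1pt}$ converges to 0 in probability. In the accuracy and confidence formulation, $\\left(1 - \\epsilon\\right)^n \\le \\delta\\hspace{1pt}$ holds once $n \\ge \\ln \\delta / \\ln\\left(1 - \\epsilon\\right)\\hspace{1pt}$. For $\\epsilon = 0.1\\hspace{1pt}$ and $\\delta = 0.05\\hspace{1pt}$ this gives $n_0 = 29\\hspace{1pt}$."
     ]
    },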
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "<h2>The Central Limit Theorem</h2>\n",
      "Let $X_1, X_2, \\ldots\\hspace{1pt}$ be a sequence of i.i.d. random variables with common mean $\\mu\\hspace{1pt}$ and variance $\\sigma^2\\hspace{1pt}$ and define\n",
      "\n",
      "$$Z_n = \\frac{1}{\\sigma \\sqrt{n}} \\left[\\sum_{i=1}^n X_i - n\\mu\\right]$$\n",
      "\n",
      "Then, the **CDF** of $Z_n\\hspace{1pt}$ converges to the standard normal CDF\n",
      "\n",
      "$$\\Phi\\left(z\\right) = \\frac{1}{\\sqrt{2\\pi}} \\int_{-\\infty}^z e^{-x^2/2}dx$$\n",
      "\n",
      "in the sense that \n",
      "\n",
      "$$\\lim_{n\\rightarrow \\infty} P\\left(Z_n \\le z \\right) = \\Phi\\left(z\\right)$$\n",
      "\n",
      "Note that there is an implicit assumption that the **mean and variance**, $\\mu\\hspace{1pt}$ and $\\sigma^2\\hspace{1pt}$, **are finite**. This assumption fails for certain power-law-distributed RVs, whose variance (and sometimes mean) is infinite."
     ]
    },
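    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "To illustrate how the finite-variance assumption can fail, consider a Pareto density, used here as one example of a power law, with shape $\\alpha \\gt 0\\hspace{1pt}$ and scale $x_m \\gt 0\\hspace{1pt}$, that is $f\\left(x\\right) = \\alpha x_m^\\alpha / x^{\\alpha + 1}\\hspace{1pt}$ for $x \\ge x_m\\hspace{1pt}$. Its second moment is\n",
      "\n",
      "$$E\\left[X^2\\right] = \\int_{x_m}^{\\infty} x^2 \\, \\frac{\\alpha x_m^\\alpha}{x^{\\alpha + 1}} dx = \\alpha x_m^\\alpha \\int_{x_m}^{\\infty} x^{1 - \\alpha} dx$$\n",
      "\n",
      "which is finite only when $\\alpha \\gt 2\\hspace{1pt}$. For $\\alpha \\le 2\\hspace{1pt}$ the variance is infinite, so the theorem as stated does not apply."
     ]
    },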
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "<h3>Approximation of Probability of Sum of RVs</h3>\n",
      "Let $S_n = X_1 + \\ldots + X_n\\hspace{1pt}$, where the $X_i\\hspace{1pt}$ are i.i.d. RVs each with mean $\\mu\\hspace{1pt}$ and variance $\\sigma^2\\hspace{1pt}$. If *n* is large, the probability $P\\left(S_n \\le c \\right)\\hspace{1pt}$ can be approximated as follows:\n",
      "\n",
      "1. Calculate $z = \\left(c - n \\mu\\right)/\\left(\\sigma\\sqrt{n}\\right)\\hspace{1pt}$\n",
      "\n",
      "2. Use the approximation\n",
      "\n",
      "$$P\\left(S_n \\le c \\right) \\approx \\Phi\\left(z\\right)$$\n",
      "\n",
      "where $\\Phi\\left(z\\right)\\hspace{1pt}$ is available from standard normal CDF tables."
     ]
    },
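    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "A worked instance of the two steps, with values chosen for illustration: let the $X_i\\hspace{1pt}$ be exponential with mean 1, so that $\\mu = 1\\hspace{1pt}$ and $\\sigma = 1\\hspace{1pt}$, and take $n = 100\\hspace{1pt}$ and $c = 110\\hspace{1pt}$. Then\n",
      "\n",
      "$$z = \\frac{110 - 100 \\times 1}{1 \\times \\sqrt{100}} = 1 \\qquad P\\left(S_{100} \\le 110 \\right) \\approx \\Phi\\left(1\\right) \\approx 0.8413$$"
     ]
    },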
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "<h3>Approximation to the Binomial</h3>\n",
      "If $S_n\\hspace{1pt}$ is a binomial RV with parameters *n* and *p*, with large *n*, and *k, m* are nonnegative integers, then\n",
      "\n",
      "$$P\\left(k \\le S_n \\le m \\right) \\approx \\Phi\\left( \\frac{m + \\frac{1}{2} -np}{\\sqrt{np\\left(1-p\\right)}}\\right) - \n",
      "\\Phi\\left(\\frac{k - \\frac{1}{2} -np}{\\sqrt{np\\left(1-p\\right)}}\\right)$$"
     ]
    },
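    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "A worked instance of the formula, with values chosen for illustration: for $n = 36\\hspace{1pt}$ and $p = 0.5\\hspace{1pt}$ we have $np = 18\\hspace{1pt}$ and $\\sqrt{np\\left(1-p\\right)} = 3\\hspace{1pt}$, so with $k = 15\\hspace{1pt}$ and $m = 21\\hspace{1pt}$\n",
      "\n",
      "$$P\\left(15 \\le S_n \\le 21 \\right) \\approx \\Phi\\left(\\frac{21.5 - 18}{3}\\right) - \\Phi\\left(\\frac{14.5 - 18}{3}\\right) \\approx 2\\,\\Phi\\left(1.17\\right) - 1 \\approx 0.76$$\n",
      "\n",
      "using $\\Phi\\left(1.17\\right) \\approx 0.879\\hspace{1pt}$ from a standard normal table. The $\\pm\\frac{1}{2}\\hspace{1pt}$ terms are a continuity correction that accounts for approximating an integer-valued RV by a continuous one."
     ]
    },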
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [],
     "language": "python",
     "metadata": {},
     "outputs": []
    }
   ],
   "metadata": {}
  }
 ]
}